Adding support for ARB (Application Resource Bundle) (.arb) format #338

dsavinov-actionengine · 2024-05-24T14:45:38Z

Problem and/or solution

Adding parsing and compiling for ARB (Application Resource Bundle) (.arb) format

How to test

1. Running unit-tests

/openformats/tests/formats/arb/test_arb.py contains tests for arb
Use pytest /openformats/tests/formats/arb/test_arb.py to run tests

2. Through testbed

Use "ARB" handler in the testbed

Reviewer checklist

Code:

Change is covered by unit-tests
Code is well documented, well styled and is following best practices
Performance issues have been taken under consideration
Errors and other edge-cases are handled properly

PR:

Problem and/or solution are well-explained
Commits have been squashed so that each one has a clear purpose
Commits have a proper commit message according to TEM

dsavinov-actionengine · 2024-05-24T14:46:14Z

Tagging @kbairak for review, but feel free to tag others if needed

kbairak

Great job!

kbairak · 2024-06-12T06:48:36Z

openformats/formats/json.py

+                    self.existing_keys.add(key)
+
+                if isinstance(value, DumbJson):
+                    self._find_keys(value, key)


Rather than recursing to arbitrary depth, we could search only two levels deep and collect the entries that are non-text. That way we don't have to bother with escaping keys, etc. Here is an idea; I haven't tested it, so there might be a few errors:

class ArbHandler(JsonHandler):
    ...

    def parse(self, content, **kwargs):
        ...
        keys = self._find_keys(parsed)
        stringset = self._extract(parsed, keys)

    def _find_keys(self, parsed):
        if parsed.type != dict:
            raise ParseError("...")
        keys, keys_to_ignore = set(), set()
        for key, key_position, value, _ in parsed:
            if not key.startswith("@"):
                # Only plain string values are translatable entries
                if not isinstance(value, str):
                    continue
                if key in keys:
                    raise ParseError(
                        f"Duplicate string key ('{key}') in line "
                        f"{self.transcriber.line_number}"
                    )
                keys.add(key)
            else:
                if isinstance(value, DumbJson):
                    (inner_value, _), = value.find_children('type')
                    if inner_value != "text":
                        # Strip the leading "@" so the entry matches its string key
                        keys_to_ignore.add(key[1:])
        return keys - keys_to_ignore

    def _extract(self, parsed, keys):
        stringset = []
        for key, _, value, value_position in parsed:
            if key not in keys:
                continue
            ...
        return stringset

Did not try this. Anyway, ARBs normally have only two levels of json, so we don't expect infinite nesting.
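The two-level idea discussed above can also be illustrated without openformats. The following is only a minimal sketch, assuming the ARB content is loaded with the standard `json` module instead of `DumbJson`; `find_text_keys` is a hypothetical helper name, and treating a missing `"type"` as `"text"` is an assumption made for this example. Because `json.loads` silently keeps the last of any duplicate keys, the duplicate-key check from the suggestion is not reproduced here.

```python
import json


def find_text_keys(arb_content):
    """Return the top-level string keys of an ARB document, in file order.

    Only two levels are inspected: the top-level entries and the "type"
    attribute of each "@key" metadata object. Keys whose metadata declares
    a type other than "text" are skipped.
    """
    parsed = json.loads(arb_content)
    if not isinstance(parsed, dict):
        raise ValueError("ARB root must be a JSON object")

    keys, keys_to_ignore = [], set()
    for key, value in parsed.items():
        if not key.startswith("@"):
            # Plain string values are candidate translatable entries.
            if isinstance(value, str):
                keys.append(key)
        elif isinstance(value, dict):
            # Assumption for this sketch: a missing "type" means "text".
            if value.get("type", "text") != "text":
                keys_to_ignore.add(key[1:])  # drop the leading "@"
    return [key for key in keys if key not in keys_to_ignore]


sample = """
{
  "@@locale": "en",
  "title": "Hello",
  "@title": {"type": "text", "description": "Greeting"},
  "logo": "logo.png",
  "@logo": {"type": "image"}
}
"""
print(find_text_keys(sample))  # prints ['title']
```

Here `logo` is dropped because its metadata declares a non-text type, and `@@locale` is ignored because it is a global attribute rather than a string entry.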

kbairak · 2024-06-12T06:58:31Z

openformats/formats/json.py

+        else:
+            raise ParseError("Invalid JSON")
+
+    def compile(self, template, stringset, **kwargs):


So, the only reason you re-define compile and _copy_until_and_remove_section is so that you can add the keep_sections functionality, which is only needed for testing. I am not very fond of this idea. It makes the code unnecessarily complex. However, there is a case where not removing the sections makes sense.

This is not adequately described in the docs, but when Transifex invokes this handler, it will pass is_source as a keyword argument. In this case, we can be certain that:

1. the stringset will not be missing anything, and
2. the user wants to get the exact same file they previously uploaded.

So, if this is indeed the same use-case, maybe you can rename keep_sections to is_source.

In any case, let's make sure that when is_source=True is passed to compile, the end result is the same as the originally uploaded file.

I tried this and replaced keep_sections with is_source, but it did not work: the unit tests failed. In any case, redefining the compile() method is now necessary, because it handles the language code for ARB.

kbairak · 2024-06-12T06:59:43Z

openformats/formats/json.py

+            raise ParseError("Invalid JSON")
+
+    def _extract(self, parsed):
+        if parsed.type == dict:


No need for this check; we already did it in _find_keys. Perhaps, to make it clearer, let's make one check in parse and then assume in _find_keys and _extract that parsed represents a dict.

kbairak · 2024-06-12T10:01:15Z

There is another issue I am concerned about. Looking at the examples of the official Flutter documentation, it looks like the language files are supposed to be more bare: only key-translation pairs without any metadata.

So, the question is: should we follow this? Should we make sure that when is_source=False during compile, only the key-translation pairs are present, while when is_source=True, the user gets back the same file they originally uploaded?

Have you looked into this?
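To make the question concrete, here is a minimal sketch of the proposed behavior, written against plain dictionaries rather than the openformats template and stringset machinery. `compile_arb` and its arguments are hypothetical names; the sketch only illustrates the proposal and is not the handler's actual code.

```python
import json


def compile_arb(source_content, translations, is_source=False):
    """Compile an ARB file for the source language or a target language.

    source_content: the originally uploaded ARB text
    translations:   mapping of string key -> translated text
    """
    if is_source:
        # Source language: hand back exactly what was uploaded.
        return source_content

    source = json.loads(source_content)
    # Target language: keep only key-translation pairs, no "@" metadata.
    compiled = {
        key: translations[key]
        for key in source
        if not key.startswith("@") and key in translations
    }
    return json.dumps(compiled, ensure_ascii=False, indent=2)


source = '{"title": "Hello", "@title": {"description": "Greeting"}}'

# The source language round-trips unchanged.
assert compile_arb(source, {"title": "Hello"}, is_source=True) == source

# A target language gets a bare file without the "@title" metadata.
print(compile_arb(source, {"title": "Bonjour"}))
```

The trade-off is that target-language files no longer mirror the structure of the source file.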

dsavinov-actionengine · 2024-06-21T14:14:36Z

There is another issue I am concerned about. Looking at the examples of the official Flutter documentation

...

should we make sure that when is_source=False during compile, only the key-translation pairs should be present while when is_source=True we should make sure that they get back the same file they originally uploaded?
Have you looked into this?

I did not implement this logic in the latest commit. It would be unusual compared to other openformats handlers, whose compiled files stay as close as possible to the source files (only the strings from the stringset change). Maybe we could revisit this idea in a separate pull request.

dsavinov-actionengine · 2024-06-21T14:15:55Z

openformats/formats/json.py

+                    continue
+
+                context_key = f"@{key}.context"
+                context_value = self.metadata[context_key] \


Implemented issue 1 from QA spreadsheet

dsavinov-actionengine · 2024-06-21T14:16:15Z

openformats/formats/json.py

+        )
+        new_template = self._clean_empties(new_template)
+
+        if language_info is not None:


Implemented issue 2 from QA spreadsheet

dsavinov-actionengine · 2024-06-21T14:16:29Z

openformats/formats/json.py

+                context_value = self.metadata[context_key] \
+                    if context_key in self.metadata.keys() else ""
+                description_key = f"@{key}.description"
+                description_value = self.metadata[description_key] \


Implemented issue 3 from QA spreadsheet

kbairak requested changes Jun 12, 2024

dsavinov-actionengine force-pushed the support_arb branch from 73356ed to e8ad0fe Compare June 21, 2024 14:01

dsavinov-actionengine commented Jun 21, 2024

dsavinov-actionengine requested a review from kbairak June 24, 2024 09:49

dsavinov-actionengine added 2 commits June 25, 2024 16:26

Adding support for ARB format

36a309b

QA spreadsheed issues 1,2,3

7941dd7

kbairak force-pushed the support_arb branch from 150cbde to 7941dd7 Compare June 25, 2024 13:26

kbairak approved these changes Jun 25, 2024

kbairak enabled auto-merge June 25, 2024 13:26

kbairak merged commit 385d505 into transifex:devel Jun 25, 2024
3 checks passed

txsentinel mentioned this pull request Jun 25, 2024

Release: 0.0.123 #342

Merged
